Abstract

This study explored the interaction between student characteristics and the online environment in 
predicting course performance and subsequent college persistence among students in a large urban U.S. 
university system. Multilevel modeling, propensity score matching, and the KHB decomposition method 
were used. The most consistent pattern observed was that native-born students were at greater risk online 
than foreign-born students, relative to their face-to-face outcomes. Having a child under 6 years of age
also interacted with the online medium to predict lower rates of successful course completion online than 
would be expected based on face-to-face outcomes. In addition, while students enrolled in online courses 
were more likely to drop out of college, online course outcomes had no direct effect on college 
persistence; rather other characteristics seemed to make students simultaneously both more likely to 
enroll online and to drop out of college. 
Introduction

Higher education is undergoing a virtual transformation as course offerings move from in-person 
to Internet-based instruction. By 2013, over 40 million college students took online classes worldwide; 
by 2017, that number should triple (Atkins, 2013). Because of this rapid growth, many institutions have 
had to make policy decisions about online learning before adequate evidence was available. For example, 
more institutions are requiring all students to take online courses, despite evidence that this may decrease 
course and college completion for some students (Jaggars, 2011). And most colleges use surveys to 
screen out students at risk online, despite the fact that no online readiness surveys have yet been validated 
as predictors of differential online versus face-to-face outcomes (Wladis & Samuels, 2016). Online 
courses can provide increased access to college, but because they often have higher attrition (the reasons 
for which are not yet well understood), they may also be stumbling blocks to degree completion. On the 
other hand, restricting access to online courses may impede the college progress of “non-traditional” 
students who need the flexibility that online learning affords. The rapid growth of online learning will 
likely change the very nature of higher education over the coming decades. If policies guiding the 
implementation of online learning are to be grounded in research evidence, education research must keep 
up with these trends. In particular, it is essential to identify which students are at higher risk online. 
Additionally, the results of this study suggest that online students are not more likely to drop out of
college immediately after, or due to, the outcomes of the online course; rather, it seems that other
student characteristics may be significant in determining college persistence.

Research questions 

This study explores the relationship among student characteristics, online course-taking and 
course and college persistence. Specifically, we ask: 

1. Which student characteristics exacerbate or mitigate differences in rates of online versus face-to- 
face successful course completion? 

2. To what extent do online course outcomes explain subsequent college dropout rates? 

Theoretical framework and prior research 


Online Outcomes 

Numerous studies, including a meta-analysis of over 200 studies, have found no significant
difference in learning outcomes in online versus face-to-face courses (e.g., Bernard et al., 2004; Bowen,
Chingos, Lack, & Nygren, 2012). Yet online course dropout rates range from 20-40% (e.g., Pierrakeas,
Xenos, Panagiotakopoulos, & Vergidis, 2004), and online attrition rates have been reported as 7-20
percentage points higher than those for face-to-face courses (e.g., Nora & Snyder, 2009; Patterson &
McFadden, 2009). There is little research on the effects of online course-taking on college
persistence and completion, and what results are available are mixed (see e.g., Shea & Bidjerano, 2014;
Xu & Jaggars, 2011). Examining student characteristics, however, may help to predict which students are
at highest risk online.

Student characteristics and online enrollment 

Online learners are more likely to be female, older, married, or active military, or to have other
responsibilities (e.g., full-time work, children), and are more likely to have other “non-traditional”
characteristics (e.g., delayed college enrollment; no high school diploma; part-time enrollment;
financial independence) (Shea & Bidjerano, 2014; Wladis, Hachey, & Conway, 2015). Studies have also
found that online students tend to have higher academic preparation and higher G.P.A.s, to be white and
native English speakers, and to be more likely to have applied for or received financial aid (Conway,
Wladis, & Hachey, n.d.; Jaggars & Xu, 2010; Xu & Jaggars, 2011). Online learning also seems to attract
a larger proportion of first generation college students (Athabasca University, 2006). However, research
on demographic variables is conflicting (Jones, 2010), and it is unclear how differing characteristics
interact to affect student retention in online courses.

Student characteristics and online outcomes 

Student skills and psychological attributes may be less predictive of online outcomes than other
factors, or may apply equally to success online and face-to-face. Bernard, Brauer, Abrami, and Surkes
(2004) found that self-direction and beliefs were significant positive predictors of online course grade, but
that G.P.A. was a stronger predictor of online course outcomes. Waschull (2005) found that
self-discipline/motivation was significantly correlated with course grades online, but concluded that the
same factors may predict success in both online and face-to-face classes. Aragon and Johnson (2008) found
that online completers were more likely to be female, to be enrolled in more classes, and to have a higher
G.P.A., but they found no significant difference in academic readiness or self-directed learning.

Other investigations of student characteristics have also been inconclusive. Some gender studies found
no differences, whereas others report that females outperform males (for a review, see Xu & Jaggars, 2013).
Angiello (2002) and Xu and Jaggars (2013) report differences for Hispanic and Black students in
comparison to White students, while Welsh (2007), Aragon and Johnson (2008), and Wladis, Conway, and
Hachey (2015) found that ethnicity was not related to online course outcomes more so than face-to-face
outcomes. G.P.A. has been identified as a significant factor affecting online course outcomes in some
studies (e.g., Xu & Jaggars, 2013), but not others (e.g., Hachey, Wladis, & Conway, 2012).

To accurately assess whether a factor puts a student at greater risk in the online environment specifically, 
it is essential to analyze the interaction between that factor and course medium, while simultaneously 
controlling for self-selection into online courses. Only a few studies consider these interactions, and 
while these studies controlled for some self-selection factors, all of them excluded important predictors. 
Xu and Jaggars (2013) found that Black students and students with lower G.P.A.s did worse online than
would be expected based on their face-to-face performance, and that women and older students did better
than expected online. Wladis, Conway, and Hachey (2015) found that older students did significantly
better, and that women did significantly worse, online than would be expected based on their outcomes in
comparable face-to-face courses, but that there was no significant interaction between the online medium
and ethnicity. Neither of these studies, however, controlled for whether a student had children, among
other factors.

This study addresses an important gap in the literature by considering which factors may predict 
differential online versus face-to-face performance while also controlling for a wide array of student 
characteristics related to self-selection into online courses. 

Methodology 


Data source and sample 

This study used a sample of 9,663 students with 37,442 course records from the 18 two- and four- 
year colleges in the City University of New York (CUNY) system. Students were selected if they were 
enrolled in a course in the sample frame, which consisted of online and comparable face-to-face courses 
offered during the 2014-2015 fall semester at one of the CUNY colleges. At the end of the semester, 
students in the sample were invited to participate in an online survey. The survey had a response rate of 
12.1%, which is typical for surveys of this type with this population, and responses were weighted to 
account for potential nonresponse bias (see below for details). Detailed summary statistics for the 
survey sample, broken down by course medium, can be found in Table 1. 
Table 1. Summary statistics by course medium

                                                        Mean/Proportion
                                                        Fully     Not Fully
                                                        Online    Online      Overall
Female                                                  74%       69%         70%
Ethnicity                                               2.78      3.04        3.01
  White                                                 29%       23%         24%
  Black                                                 25%       25%         25%
  Hispanic                                              32%       29%         29%
  Asian or Pacific Islander                             14%       23%         22%
  American Indian or Native Alaskan                     0.2%      0.4%        0.3%
Age                                                     29        26          26
Child                                                   32%       18%         20%
Child under six                                         16%       9%          9%
Work hrs/wk                                             24.7      15.6        16.6
Income
  less than $20,000/yr                                  41%       60%         58%
  $20,000-39,999/yr                                     21%       18%         18%
  $40,000-59,999/yr                                     14%       10%         10%
  $60,000/yr or more                                    24%       12%         13%
Parental Education
  Don't know                                            5%        8%          7%
  No HS diploma                                         12%       13%         13%
  HS diploma                                            24%       24%         24%
  Associate’s degree, vocational training,
    certificate, or some college                        20%       18%         18%
  Bachelor's degree                                     20%       21%         21%
  Graduate/professional degree                          19%       17%         17%
Developmental course placement                          32%       45%         43%
Immigrant generational status
  Not born in US                                        38%       44%         43%
  Born in US with at least one parent not born in US    32%       35%         34%
  Born in US and both parents born in US                30%       21%         22%
Native English speaker                                  63%       57%         57%
No GPA (first-semester freshman)                        11%       32%         29%
GPA (for those with a GPA)
  Under 2.0                                             1%        7%          6%
  2.0-2.49                                              9%        11%         11%
  2.5-2.99                                              19%       19%         19%
  3.0-3.49                                              29%       29%         29%
  3.5-4.0                                               42%       34%         35%
Number of credits in which enrolled                     11.6      12.3        12.2
Level
  Community college                                     16%       39%         37%
  Four-year                                             66%       51%         53%
  Graduate                                              18%       9%          10%
Measures

This research utilizes two measures of student outcomes: successful course completion, or 
whether the student successfully completed a course with a C- or higher (the typical standard to receive 
major or transfer credit), and college persistence, or whether the student re-enrolled in college in the 
subsequent full semester. 

The main independent variable (IV) of interest, course medium, was dichotomized to face-to-face 
or fully online, based on Sloan Consortium definitions (Allen & Seaman, 2010). Fully online courses 
have 80% or more of the course content online, and face-to-face courses have 33% or less of the content 
online. Prior research suggests that students who take hybrid courses (33-80% online content) are 
substantially similar to students who take face-to-face courses and that the outcomes are similar (Xu & 
Jaggars, 2011). 

The other IVs in this study were chosen because there is evidence that they may: 1) predict online 
course enrollment; 2) be related to course or college outcomes; or 3) be significant predictors of outcomes 
in the online medium. Covariates included: whether the student had a child (and age of youngest child); 
gender; race/ethnicity; age; work hours; income; parental education; developmental course placement; 
marital/cohabitation status; immigration generational status; native speaker status; college level
(two-year, four-year, or graduate); G.P.A.; and number of credits/classes taken that semester. During
preliminary analyses, different non-linear versions of variables were explored (e.g. converting credits to 
part-time/full-time status, squaring age), but these did not seem to model the actual distribution of the data 
any better or to produce significantly different results. 

The survey used in this study included scales measuring: motivation to complete the course; 
course enjoyment/engagement; academic integration (i.e. interaction with faculty/students outside class); 
self-directed learning skills; time management skills; preference for autonomy; and grit (i.e. perseverance 
and passion for long-term goals). These scales, to the extent possible, were based on previous 
instruments already tested for reliability and validity (Duckworth, Peterson, Matthews, & Kelly, 2007; 
Macan, Shahani, Dipboye, & Phillips, 1990; Pintrich & de Groot, 1990; U.S. Department of Education, 
Institute of Education Sciences, National Center for Education Statistics, 2009; Vallerand et al., 1992), 
but were shortened and modified for use in this study. Confirmatory factor analysis using structural 
equation modeling (SEM) was used to model items for each scale as predictors of a single latent 
construct. Error covariance terms were added between some individual items based on theory, prior to 
estimation. Some items from the motivation and grit scales were eliminated because of poor performance 
during SEM. For the final scales, average variance extracted (AVE) was 0.50 or greater, indicating 
convergent validity, and composite reliability (CR) ranged from 0.77 to 0.89, indicating good reliability 
(Hair, Anderson, Tatham, & Black, 1998); the standardized root mean square residual (SRMR) ranged
from 0.000 to 0.059, supporting the operationalization of each scale as a single factor structure (Hu & 
Bentler, 1999). 
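The two scale-quality statistics named above can be illustrated with a short sketch. It uses the common textbook formulas for AVE and CR computed from standardized factor loadings, and it assumes uncorrelated item errors; the study added some error covariances, which would change the CR denominator. The loadings are hypothetical and are not values from this study.

```python
def average_variance_extracted(loadings):
    """Mean of the squared standardized loadings for one latent construct."""
    return sum(l ** 2 for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """Squared sum of loadings divided by itself plus the summed error variances.

    Assumes standardized loadings and uncorrelated errors, so each item's
    error variance is 1 - loading^2.
    """
    true_part = sum(loadings) ** 2
    error_part = sum(1 - l ** 2 for l in loadings)
    return true_part / (true_part + error_part)


# Hypothetical standardized loadings for a four-item scale (illustrative only).
example_loadings = [0.70, 0.75, 0.80, 0.65]

ave = average_variance_extracted(example_loadings)
cr = composite_reliability(example_loadings)
print(f"AVE = {ave:.3f}, CR = {cr:.3f}")
```

With these made-up loadings the scale would just clear the 0.50 AVE threshold cited above, which shows how a scale can have acceptable reliability while its convergent validity is only marginal.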

Analytical Approaches 

Courses for which valid grades did not exist (e.g. not submitted by instructor, course was audited) 
were dropped. Multivariate multiple imputation by chained equations was used to impute values for 
survey questions with missing responses, using all IVs chosen for subsequent analyses. Depending on 
variable type, binomial, ordered, or multinomial logit models, or predictive mean matching on three 
nearest neighbors was used for imputation. A median of 2.6% of data were missing in each imputed 
variable in the sample. After preliminary tests for stability of model estimates, final imputed datasets 
contained 35 imputations. 
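As a rough illustration of predictive mean matching on three nearest neighbors, the sketch below imputes one continuous variable from a single predictor. It is a deliberately simplified, single-pass version: the study's chained-equations procedure cycles over many variables, uses all IVs as predictors, and repeats the process for each of the 35 imputations. The function names and the data are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def pmm_impute(xs, ys, k=3, rng=None):
    """Fill None entries in ys by borrowing an observed value from one of the
    k cases whose predicted value is closest to the missing case's prediction."""
    rng = rng or random.Random(0)
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    intercept, slope = fit_line([x for x, _ in observed], [y for _, y in observed])
    # Pair each observed case's predicted value with its actual value.
    donors = [(intercept + slope * x, y) for x, y in observed]
    completed = []
    for x, y in zip(xs, ys):
        if y is not None:
            completed.append(y)
            continue
        target = intercept + slope * x
        nearest = sorted(donors, key=lambda d: abs(d[0] - target))[:k]
        completed.append(rng.choice(nearest)[1])  # borrow a real observed value
    return completed


# Hypothetical data: weekly work hours (complete) and credits (two missing).
work_hours = [0, 10, 20, 30, 40, 15, 35]
credits = [15, 12, 12, None, 6, None, 9]

imputed_credits = pmm_impute(work_hours, credits)
print(imputed_credits)
```

Because each imputed value is copied from an observed case, the method never produces impossible values (such as fractional or negative credits), which is one reason it is often preferred for skewed or bounded survey variables.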

Propensity scores, indicating the probability of online enrollment, were generated for each student 
using logistic regression and included all of the IVs used in the subsequent analyses; scores were 
averaged across imputed datasets. Initially data were weighted prior to propensity score matching to
account for survey non-response, but since it is not well established in the research literature how to best 
perform propensity score matching with weighted data (see e.g. DuGoff, Schuler, & Stuart, 2014), and 
since preliminary models with and without weights were substantially similar, subsequent analysis was 
performed without sample weights. Matched datasets were generated using single nearest-neighbor 
matching with replacement because this approach yielded the best balance on the covariates, based on the 
standardized bias for each imputed variable averaged across imputations. The median standardized bias 
across variables was 2.6%. Based on Rubin’s (2001) rule of thumb that standardized bias should be 
approximately below 25% after matching, the matched dataset achieved good balance on all covariates. 
Distribution of propensity scores was evaluated before and after matching, and both datasets showed 
significant overlap in the region of common support (see Figures 1-2). 
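The matching and balance check described above can be sketched as follows. This is a minimal illustration, not the study's code: it assumes propensity scores have already been estimated, matches each online student to the single nearest comparison student with replacement, and computes standardized bias as the mean difference divided by the pooled standard deviation of the two groups being compared (some definitions fix the denominator at the pre-matching standard deviations instead). All numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def match_with_replacement(treated_scores, control_scores):
    """For each treated unit, return the index of the control unit with the
    closest propensity score. A control may be reused for several treated units."""
    return [
        min(range(len(control_scores)), key=lambda j: abs(control_scores[j] - p))
        for p in treated_scores
    ]


def standardized_bias(treated, control):
    """Difference in means as a percentage of the pooled standard deviation."""
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    return 100 * (mean(treated) - mean(control)) / pooled_sd


# Hypothetical propensity scores and one covariate (age) for each group.
online_scores = [0.30, 0.45, 0.50]
f2f_scores = [0.10, 0.28, 0.47, 0.20]
online_age = [31, 28, 30]
f2f_age = [22, 30, 29, 24]

matches = match_with_replacement(online_scores, f2f_scores)
matched_f2f_age = [f2f_age[j] for j in matches]

bias_before = standardized_bias(online_age, f2f_age)
bias_after = standardized_bias(online_age, matched_f2f_age)
print(matches, round(bias_before, 1), round(bias_after, 1))
```

In this toy example the same comparison student is matched twice, which is the trade-off of matching with replacement: better balance at the cost of reusing observations.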



Figure 1. Propensity scores before matching (legend: fully online, not fully online)

Figure 2. Propensity scores after matching (horizontal axis: propensity score, 0 to .6; legend: fully
online, not fully online)

Each dataset was formatted into two distinct datasets: a student-level dataset, in which each 
record was a single student and included information about course outcomes for that student for the 
course in the sample frame, and a course-level dataset, in which each record was the outcome of a course 
taken by one of the students in the sample (all courses taken in fall/winter by a student in the sample were 
included). The first dataset was used to run multilevel mixed-effects logistic regression models with
student as the first-level and course as the second-level factors, in order to control for unobserved 
heterogeneity between courses (comparing outcomes in the same course for different students); the 
second dataset was used to run multilevel models with course as the first-level and student as the second- 
level factors, in order to control for unobserved heterogeneity between students (comparing outcomes in 
online versus face-to-face courses taken by the same student). The KHB decomposition method (Kohler, 
Karlson, & Holm, 2011) was used to calculate direct and indirect effects, in order to explore the 
relationship between online course outcomes, student characteristics, and subsequent college persistence. 
Standard errors during KHB decomposition were computed using clustering by course, to account for the 
multi-level data structure. 


Results and Discussion 

This section describes factors that had a significant interaction with the online environment in 
predicting course and college outcomes. This means that the difference in outcomes online versus face- 
to-face is significantly different for distinct factor values. For example, if we say that being native-born 
put students at higher risk of dropout online, what we mean is that the change in dropout rates when 
moving from the face-to-face to online medium (all other factors being equal) is more positive for
native-born than foreign-born students. This could mean that both foreign- and native-born students do
worse online than face-to-face, but that the drop in performance is smaller for foreign- than native-born 
students. Or it could mean that both foreign- and native-born students do better online than face-to-face, 
but that the increase in performance is greater for foreign- than native-born students. Or it could mean 
that native-born students do worse online and foreign-born students do better. In any of these three cases,
foreign-born students might drop out of face-to-face or online courses at higher or lower rates than
native-born students; the direction or significance of the interaction alone does not provide any information
about the relative outcomes of these groups overall, just about how outcomes change for these groups 
across different course mediums. We note also that an interaction is a contrast between two or more 
groups (e.g. foreign-born versus native-born); for continuous factors this means that differences in outcomes
across mediums are contrasted for higher and lower values of that factor (e.g. older versus younger 
students). 
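To make the meaning of an interaction concrete, the sketch below computes an unadjusted interaction odds ratio from four successful-completion rates. The rates are hypothetical and chosen so that native-born students do better than foreign-born students face-to-face, yet still show an interaction odds ratio below 1 because their outcomes fall further online. The study's own estimates come from covariate-adjusted multilevel models, so this is only an illustration of the quantity being interpreted.

```python
def odds(p):
    """Convert a probability into odds."""
    return p / (1 - p)


def interaction_odds_ratio(f2f_ref, online_ref, f2f_grp, online_grp):
    """Ratio of two odds ratios: the online-versus-face-to-face odds ratio for
    the group of interest divided by the same odds ratio for the reference group."""
    or_ref = odds(online_ref) / odds(f2f_ref)
    or_grp = odds(online_grp) / odds(f2f_grp)
    return or_grp / or_ref


# Hypothetical successful-completion rates (illustrative only, not study data).
foreign_born = {"f2f": 0.80, "online": 0.78}   # reference group
native_born = {"f2f": 0.85, "online": 0.75}    # group of interest

ratio = interaction_odds_ratio(
    foreign_born["f2f"], foreign_born["online"],
    native_born["f2f"], native_born["online"],
)
print(f"interaction odds ratio (native-born vs. foreign-born): {ratio:.2f}")
```

A value below 1 says only that the online-versus-face-to-face change is less favorable for native-born students; it does not say which group has the better outcomes in either medium.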

The most consistent pattern observed in this study was that native-born students (particularly 
those with two native-born parents) were at greater risk online than foreign-born students. At CUNY 
roughly 40% of students are foreign-born. Some research has shown that cultural norms prevent certain 
immigrant groups from actively participating in face-to-face classroom discussions and that online 
discussions produced more opportunity for interaction and participation among immigrant students, so 
this is one possible explanation for these results (e.g., Campbell, 2007; Yildiz & Bichelmeyer, 2003).

Having a child under 6 years of age was a significant predictor of lower rates of successful course 
completion for both the matched and unmatched student-level datasets. Similar trends were observed for 
the course-level dataset, but the differences were not significant, perhaps because of relatively small 
numbers of students with pre-school-aged children in each subgroup. Repeating the analysis with a 
binary variable indicating whether the student had a child instead of whether they had a child under six 
produced similar results. It may be that student parents are more likely to enroll in online courses if they 
have greater time constraints, and that these same students are less likely to successfully complete a 
course. The fact that this pattern was significant only for the student-level dataset (where unobserved 
heterogeneity was accounted for by course and not by student), but not in the course-level dataset (where 
unobserved heterogeneity by student was accounted for), supports this interpretation. These results 
suggest that without adequate support for student parents (e.g. childcare, financial aid to reduce work 
hours), the flexibility that online courses offer may not be enough to compensate for the time demands of 
parenthood. 
Table 1. Multi-level logistic regression model of successful course completion

                                        unmatched         matched           unmatched         matched
                                        student-level     student-level     course-level      course-level
                                        OR (SE) sig.      OR (SE) sig.      OR (SE) sig.      OR (SE) sig.
online:motivation                       1.05 (0.03)       1.03 (0.04)
online:enjoyment                        1.03 (0.03)       0.98 (0.03)
online:acad integration                 1.12 (0.08)       1.15 (0.11)       0.97 (0.04)       0.88 (0.05)
online:self-directed                    1.03 (0.03)       1.05 (0.04)       1.005 (0.02)      1.01 (0.03)
online:time mgmt                        1.01 (0.04)       1.004 (0.05)      1.01 (0.02)       0.99 (0.03)
online:autonomy                         0.99 (0.04)       0.98 (0.05)       0.96 (0.02)       0.95 (0.03)
online:grit                             0.99 (0.07)       0.91 (0.08)       1.04 (0.04)       1.04 (0.05)
online:child under 6                    0.36 (0.21)       0.21 (0.17) *     0.85 (0.27)       0.63 (0.28)
online:work hrs                         0.999 (0.01)      1.01 (0.01)       0.996 (0.01)      0.995 (0.01)
online:income
  online:$20,000-39,999/yr              1.11 (0.57)       0.42 (0.29)       0.85 (0.22)       0.59 (0.21)
  online:$40,000-59,999/yr              1.18 (0.79)       1.11 (0.90)       0.94 (0.32)       0.94 (0.43)
  online:$60,000/yr or more             4.76 (4.01)       2.90 (2.91)       1.29 (0.47)       0.72 (0.38)
fully online x parental edu
  online:don't know                     0.78 (0.66)       0.55 (0.62)       0.58 (0.26)       0.46 (0.30)
  online:HS degree                      1.18 (0.71)       0.59 (0.46)       1.25 (0.41)       0.70 (0.31)
  online:Associate’s, technical,
    certificate, or some college        2.17 (1.51)       1.66 (1.42)       0.93 (0.31)       0.57 (0.26)
  online:Bachelor’s                     0.96 (0.62)       0.92 (0.74)       1.13 (0.39)       0.84 (0.39)
  online:grad/professional              1.02 (0.75)       0.89 (0.81)       0.56 (0.20)       0.52 (0.25)
online:developmental                    1.39 (0.62)       0.69 (0.40)       1.82 (0.41) **    1.38 (0.43)
online:immigration
  online:native-born,
    at least one foreign-born parent    0.45 (0.24)       0.75 (0.47)       0.63 (0.16)       0.70 (0.24)
  online:native-born,
    no foreign-born parents             0.25 (0.15) *     0.19 (0.14) *     0.47 (0.13) **    0.39 (0.15)
online:married                          1.69 (0.85)       3.41 (2.12) *     1.32 (0.33)       1.71 (0.59)
online:female                           0.66 (0.30)       1.34 (0.76)       1.02 (0.23)       1.22 (0.19)
fully online x ethnicity
  online:Black                          1.57 (0.91)       2.33 (1.64)       1.03 (0.30)       0.94 (0.37)
  online:Hispanic                       1.91 (1.13)       2.17 (1.55)       1.19 (0.35)       0.73 (0.29)
  online:Asian/Pacific Islander         0.41 (0.29)       0.60 (0.53)       0.96 (0.34)       0.84 (0.41)
online:age                              0.9998 (0.01)     1.03 (0.03)       0.98 (0.01)       0.97 (0.02)
online:ESL                              1.82 (1.30)       1.83 (1.54)       0.73 (0.22)       0.85 (0.36)
online:level
  online:four-year                      2.41 (1.10)       0.85 (0.50)       1.56 (0.36)       1.35 (0.45)
  online:graduate                       0.34 (0.38)       0.57 (0.73)       1.05 (0.76)       2.50 (2.24)
online:GPA
  online:none                           21.4 (27.3) *     1.02 (1.15)       2.27 (1.51)       2.91 (3.65)
  online:2.0-2.49                       1.26 (1.24)       0.15 (0.14) *     1.17 (0.73)       2.05 (2.52)
  online:2.5-2.99                       1.85 (1.84)       0.17 (0.14) *     1.63 (1.00)       2.57 (3.14)
  online:3.0-3.49                       6.29 (6.52)       0.62 (0.51)       1.55 (0.95)       2.16 (2.63)
  online:3.5 and above                  10.1 (11.7) *                       1.65 (1.05)       1.75 (2.16)
online:credits                          0.99 (0.06)       1.03 (0.08)       0.99 (0.03)       0.98 (0.04)

• p<0.10, * p<0.05, ** p<0.01, *** p<0.001
OR = odds ratio; SE = standard error

Only coefficients for interactions with online medium are reported here.
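Readers who want to relate a reported odds ratio and standard error to its significance marker can use a Wald approximation, sketched below. It assumes that the reported SE is on the odds-ratio scale (so the SE of the log odds ratio is roughly SE/OR by the delta method) and that a two-sided normal test was used; the models in this study may have used a different test, and the table values are rounded, so results near a threshold can differ. The function name is hypothetical.

```python
from math import erfc, log, sqrt


def wald_from_odds_ratio(odds_ratio, se_odds_ratio):
    """Approximate z statistic and two-sided p-value from an OR and its SE.

    Assumes the SE is reported on the odds-ratio scale, so the SE of the
    log odds ratio is approximately se_odds_ratio / odds_ratio.
    """
    se_log_or = se_odds_ratio / odds_ratio
    z = log(odds_ratio) / se_log_or
    p = erfc(abs(z) / sqrt(2))  # two-sided tail area of the standard normal
    return z, p


# Two entries taken from the unmatched course-level column of the table.
z_native, p_native = wald_from_odds_ratio(0.47, 0.13)   # native-born, no foreign-born parents
z_devel, p_devel = wald_from_odds_ratio(1.82, 0.41)     # developmental placement

print(f"native-born: z = {z_native:.2f}, p = {p_native:.4f}")
print(f"developmental: z = {z_devel:.2f}, p = {p_devel:.4f}")
```

Under these assumptions both entries fall below p = 0.01, in line with the ** markers shown for them.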
College persistence

This study also explored the extent to which the subsequent college persistence of online students 
could be directly related to the outcomes of their online courses, and to what extent it is likely related to 
other characteristics which also increase the likelihood of taking an online course. Online students in this 
study were significantly less likely to persist in college by re-enrolling in courses at the university in the 
subsequent semester, but it is unclear whether this is related to their online course-taking or whether this is
the result of factors that make these students both more likely to enroll online and more likely to drop or 
stop out of college. The KHB decomposition method was used to calculate the direct, indirect, and total 
effect of taking a fully online course on subsequent college persistence as mediated by successful course 
completion, while controlling for the variables included as covariates in Tables 1-3. In Table 4, the 
direct, indirect, and total effects of this model can be seen: there is no significant indirect effect,
suggesting that online students are not more likely to drop out of college immediately after, or due to, the
outcomes of the online course; rather, it seems that other student characteristics may be significant in
determining simultaneously online course enrollment and college persistence.

Table 4. Direct, indirect, and total effect of fully online course medium on subsequent college persistence
as mediated by successful course completion, controlling for all covariates in Tables 1-2

Effect       Coef.     SE       P
Total       -0.483     0.267    0.071
Direct      -0.501     0.267    0.061
Indirect     0.018     0.012    0.146
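The arithmetic behind Table 4 can be checked with a few lines. In a KHB decomposition the total effect is the sum of the direct and indirect effects, and each p-value can be approximated from the coefficient and its standard error with a two-sided normal test. The sketch below uses only the rounded values printed in Table 4, so the recomputed p-values are approximations and will not match the table to the last digit.

```python
from math import erfc, sqrt

# Coefficients (logit scale) and standard errors as printed in Table 4.
effects = {
    "total": (-0.483, 0.267),
    "direct": (-0.501, 0.267),
    "indirect": (0.018, 0.012),
}


def two_sided_p(coef, se):
    """Two-sided p-value for a Wald z test, assuming a standard normal reference."""
    return erfc(abs(coef / se) / sqrt(2))


# The indirect effect implied by the other two rows: total minus direct.
implied_indirect = effects["total"][0] - effects["direct"][0]
p_values = {name: two_sided_p(coef, se) for name, (coef, se) in effects.items()}

print(f"implied indirect effect: {implied_indirect:.3f}")
for name, p in p_values.items():
    print(f"{name}: p is approximately {p:.3f}")
```

The implied indirect effect is small and positive, so the share of the online-persistence gap that runs through successful course completion is negligible relative to the direct component.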


Limitations 

This study analyzes data from a large U.S. university system in order to increase generalizability 
and validity, but still has some limitations. While the sample size in this study was large, not all 
subgroups were large. When considering interaction effects, as in this study, it is not necessarily the 
whole sample size that is relevant, but the size of particular subgroups. For example, only 25 students 
with children under six years old dropped the course. Because of this, there may be important 
relationships for some of these smaller subgroups that were not identified as significant in this study, but 
that would be identified as significant in a larger sample. 

In addition, while the CUNY system is highly diverse and likely generalizable to a wider U.S. 
student population, it is not necessarily nationally representative. CUNY does not have rural campuses, 
so caution should be exercised before extending any results taken from the CUNY dataset to the 
approximately 18% of U.S. college students who attend rural colleges (IPEDS, 2013). There may be
factors that impact U.S. rural online students that are not well captured in this study. 

In addition, the CUNY dataset used in this study was also more diverse than the average US 
college student population, with a higher proportion of ethnic and racial minorities, foreign-born students,
first-generation college students, students from lower socio-economic strata, and students requiring 
developmental coursework. While these features may make the samples used in this study less 
representative of the U.S. college population as a whole, they also make this data an excellent resource for 
investigating the relationship of online course-taking and college outcomes for many groups that have 
been traditionally underrepresented in college and at higher risk of dropout in the US. 

And finally, while this study has attempted to control for a wide array of factors that may 
correlate with online course enrollment or college outcomes, it is unlikely that any study could include 
them all. Online students are more likely to have more “complicated” lives that include experiences that
are difficult to measure or quantify, but that influence both decisions to enroll online and subsequent 
course and college outcomes. Further exploring and refining factors that may impact online course 
enrollment should be a focus of future educational research if we are to conduct well-controlled 
observational studies about online outcomes. 

Implications and Conclusion 

Colleges wanting to target interventions to students at highest risk in the online environment may 
want to focus on supporting student parents (perhaps by providing financial support and/or assistance 
accessing childcare), and native-born students in areas where foreign-born students are heavily 
represented. But while these are the groups found by this study to be most vulnerable in the online 
environment specifically, these groups are not necessarily the ones with the poorest absolute online 
outcomes. For example, for the dataset used in this study, household income was strongly correlated with 
course and college outcomes even though it was not relevant to the online environment specifically. 
Lower-income students likely still need significant support in online courses, just as they do in face-to- 
face classes. In addition to targeting student groups that are vulnerable in the online environment 
specifically, colleges hoping to improve online retention should continue to support student groups that 
have historically been identified as at-risk generally. 

Furthermore, in this study, online course outcomes had no direct effect on college persistence. 
This suggests that taking online courses likely does not lead directly to lower rates of college persistence
on average, but rather that there are characteristics that lead students to both enroll in online courses and
drop out of college at higher rates. Further research with specific subgroups (e.g. community college 
students) and with other samples is necessary in order to confirm the extent to which this pattern is 
generalizable. 